The Knowledge Gradient for Optimal Learning

Author

  • Warren B. Powell
Abstract

Optimal learning addresses the problem of how to collect information so that it benefits future decisions. For off-line problems, we have to make a series of measurements or observations before choosing a final design or set of parameters; for online problems, we learn from rewards we are receiving, and we want to strike a balance between rewards earned now and better decisions in the future. This chapter reviews these problems, describes optimal and heuristic policies, and shows how to compare competing policies. Then, the presentation focuses on the concept of the knowledge gradient, which guides information collection by maximizing the marginal value of information. We show how this idea can be applied to both online and off-line problems, as well as a broad range of other applications which have not previously yielded to formal techniques.
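To make "maximizing the marginal value of information" concrete, the following is a minimal sketch of the knowledge gradient for the off-line case, assuming independent normal beliefs about each alternative and a known measurement-noise variance. This is the form commonly given in the ranking-and-selection literature, not necessarily the chapter's own notation; the function name `knowledge_gradient` and its parameters are illustrative assumptions.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(means, variances, noise_var):
    """Expected one-step improvement from measuring each alternative once.

    Assumes independent normal beliefs (illustrative setup):
      means[x], variances[x] : current belief about alternative x
      noise_var              : variance of a single measurement
    Requires at least two alternatives.
    """
    kg = []
    for x, (mu, var) in enumerate(zip(means, variances)):
        # Best competing belief, excluding the alternative being measured.
        best_other = max(m for i, m in enumerate(means) if i != x)
        # Standard deviation of the change in the mean caused by one measurement.
        sigma_tilde = var / math.sqrt(var + noise_var)
        if sigma_tilde == 0.0:
            kg.append(0.0)  # nothing left to learn about x
            continue
        # Normalized distance to the best competitor (always <= 0).
        zeta = -abs(mu - best_other) / sigma_tilde
        kg.append(sigma_tilde * (zeta * normal_cdf(zeta) + normal_pdf(zeta)))
    return kg


# Usage: measure the alternative with the largest knowledge gradient.
scores = knowledge_gradient([1.0, 1.0, 1.0], [0.5, 2.0, 1.0], noise_var=1.0)
choice = max(range(len(scores)), key=scores.__getitem__)
```

With equal means, the sketch favors the alternative whose belief is most uncertain, since a measurement there is most likely to change the final decision.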

Similar resources

The Parallel Knowledge Gradient Method for Batch Bayesian Optimization

In many applications of black-box optimization, one can evaluate multiple points simultaneously, e.g. when evaluating the performances of several different neural network architectures in a parallel computing environment. In this paper, we develop a novel batch Bayesian optimization algorithm — the parallel knowledge gradient method. By construction, this method provides the one-step Bayes opti...

SIZE AND GEOMETRY OPTIMIZATION OF TRUSSES USING TEACHING-LEARNING-BASED OPTIMIZATION

A novel optimization algorithm named teaching-learning-based optimization (TLBO) algorithm and its implementation procedure were presented in this paper. TLBO is a meta-heuristic method, which simulates the phenomenon in classes. TLBO has two phases: teacher phase and learner phase. Students learn from teachers in teacher phases and obtain knowledge by mutual learning in learner phase. The suit...

The knowledge gradient algorithm for online learning

We derive a one-period look-ahead policy for finite- and infinite-horizon online optimal learning problems with Gaussian rewards. The resulting decision rule easily extends to a variety of settings, including the case where our prior beliefs about the rewards are correlated. Experiments show that the KG policy performs competitively against other learning policies in diverse situations. In the ca...

Finite-time analysis for the knowledge-gradient policy and a new testing environment for optimal learning

We consider two learning scenarios, the offline Bayesian ranking and selection problem with independent normal rewards and the online multi-armed bandit problem. We derive the first finite-time bound of the knowledge-gradient policy for ranking and selection problems under the assumption that the value of information is submodular. We demonstrate submodularity for the two-alternative case and p...

Optimal Placement of Capacitor Banks Using a New Modified Version of Teaching-Learning-Based Optimization Algorithm

Meta-heuristic optimization methods are important techniques for the optimal design of engineering systems. Numerous methods, inspired by different natural phenomena, have been introduced in the literature. A new modified version of the Teaching-Learning-Based Optimization (TLBO) algorithm is introduced in this paper. TLBO, as a parameter-free algorithm, is based on the learning procedure of studen...

Publication date: 2010